The many successes of deep neural networks (DNNs) over the past decade have largely been driven by computational scale rather than insights from biological intelligence. Here, we explore if these trends have also carried concomitant improvements in explaining the visual strategies humans rely on for object recognition. We do this by comparing two related but distinct properties of visual strategies in humans and DNNs: where they believe important visual features are in images and how they use those features to categorize objects. Across 84 different DNNs trained on ImageNet and three independent datasets measuring the where and the how of human visual strategies for object recognition on those images, we find a systematic trade-off between DNN categorization accuracy and alignment with human visual strategies for object recognition. State-of-the-art DNNs are progressively becoming less aligned with humans as their accuracy improves. We rectify this growing issue with our neural harmonizer: a general-purpose training routine that both aligns DNN and human visual strategies and improves categorization accuracy. Our work represents the first demonstration that the scaling laws that are guiding the design of DNNs today have also produced worse models of human vision. We release our code and data at https://serre-lab.github.io/Harmonization to help the field build more human-like DNNs.
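Vision transformers are now the de facto choice for image classification tasks. Classification tasks fall into two broad categories: fine-grained and coarse-grained. In fine-grained classification, subtle differences must be discovered because sub-classes are highly similar to one another. Such distinctions are often lost when images are downscaled to save the memory and computational cost associated with vision transformers (ViT). In this work, we present an in-depth analysis and describe the critical components for developing a system for the fine-grained classification of plants from herbarium sheets. Our extensive experimental analysis indicates the need for better augmentation techniques and for modern neural networks that can handle higher-dimensional images. We also introduce a convolutional transformer architecture called Conviformer which, unlike the popular vision transformer ConViT, can handle higher-resolution images without exploding memory and computational cost. We further introduce a novel, improved pre-processing technique called PreSizer, which resizes images while preserving their original aspect ratio, a property that proved essential for classifying natural plants. With our simple yet effective approach, we achieve SOTA on the Herbarium 202x and iNaturalist 2019 datasets.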
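We argue that, when a 1-Lipschitz neural network is trained with the dual loss of an optimal transport problem, the gradient of the model is both the direction of the transport plan and the direction of the closest adversarial attack. Traveling along the gradient to the decision boundary is then no longer an adversarial attack but a counterfactual explanation, explicitly transporting a sample from one class to the other. Through extensive experiments on XAI metrics, we find that the simple saliency map method, applied to such networks, becomes a reliable explanation and outperforms state-of-the-art explanation methods applied to unconstrained models. The proposed networks were already known to be certifiably robust, and we show that they are also explainable with a fast and simple method.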
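This paper presents a new, efficient black-box attribution method based on the Hilbert-Schmidt Independence Criterion (HSIC), a dependence measure based on Reproducing Kernel Hilbert Spaces (RKHS). HSIC measures the dependence between regions of an input image and the output of a model, using kernel embeddings of distributions. It therefore provides explanations enriched by the representation capabilities of RKHS. HSIC can be estimated very efficiently, which significantly reduces the computational cost compared with other black-box attribution methods. Our experiments show that HSIC is 8 times faster than the previous best black-box attribution methods while remaining as faithful. Indeed, we improve on or match the state of the art of both black-box and white-box attribution methods for several fidelity metrics on ImageNet with a variety of recent model architectures. Importantly, we show that these advances can be transposed to explain object detection models such as YOLOv4 efficiently and faithfully. Finally, we extend traditional attribution methods by proposing a new kernel that enables an orthogonal decomposition of HSIC-based importance scores, allowing us to evaluate not only the importance of each image patch but also the importance of their pairwise interactions.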
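As a rough illustration of the dependence measure named in the abstract above, the sketch below computes the standard biased empirical HSIC estimate, trace(K H L H) / n^2, between two scalar samples. The abstract does not specify the estimator, kernel, or bandwidth, so the Gaussian kernel, the `sigma` parameter, the normalization, and the function names `rbf_gram`, `center`, and `hsic` are all assumptions made for illustration; they are not the paper's implementation.

```python
import math


def rbf_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel on scalar samples (assumed kernel)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs] for a in xs]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2, for paired samples xs and ys."""
    n = len(xs)
    k_centered = center(rbf_gram(xs, sigma))
    l_gram = rbf_gram(ys, sigma)
    # trace(H K H L) equals the elementwise sum of (H K H) * L because L is symmetric.
    total = sum(k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n))
    return total / (n * n)
```

In an attribution setting one would presumably take `xs` to be the perturbation values applied to one image region and `ys` the corresponding model outputs, so that a larger score marks a region the output depends on more strongly; that mapping is an interpretation of the abstract, not a detail it states.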
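Today's most advanced machine learning models are hardly scrutable. The key challenge for explainability methods is to help researchers open up these black boxes by revealing the strategy that led to a given decision, by characterizing their internal states, or by studying the underlying data representation. To address this challenge, we have developed Xplique: a software library for explainability that includes representative explainability methods as well as associated evaluation metrics. It interfaces with one of the most popular learning libraries, TensorFlow, as well as with other libraries including PyTorch, scikit-learn, and Theano. The code is licensed under the MIT license and is freely available at github.com/deel-ai/xplique.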
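A multitude of explainability methods and theoretical evaluation scores have been proposed. However, it is not yet known (1) how useful these methods are in real-world scenarios and (2) how well theoretical measures predict the usefulness of these methods for practical use by a human. To fill this gap, we conducted human psychophysics experiments at scale to evaluate the ability of human participants (n = 1,150) to leverage representative attribution methods to learn to predict the decisions of different image classifiers. Our results demonstrate that the theoretical measures used to score explainability methods poorly reflect the practical usefulness of individual attribution methods in real-world scenarios. Furthermore, the degree to which individual attribution methods helped human participants predict classifiers' decisions varied widely across categorization tasks and datasets. Overall, our results highlight fundamental challenges for the field, suggesting a critical need to develop better explainability methods and to deploy human-centered evaluation approaches. We will make the code of our framework available to ease the systematic evaluation of novel explainability methods.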
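We describe a novel attribution method that is grounded in sensitivity analysis and uses Sobol indices. Beyond modeling the individual contributions of image regions, Sobol indices provide an efficient way to capture, through the lens of variance, higher-order interactions between image regions and their contributions to a neural network's prediction. We describe an approach that makes the computation of these indices tractable for high-dimensional problems by using perturbation masks coupled with efficient estimators to handle the high dimensionality of images. Importantly, we show that the proposed method achieves favorable scores on standard benchmarks for vision (and language) models compared with other black-box methods, even surpassing the accuracy of state-of-the-art white-box methods, which require access to internal representations. Our code is freely available: https://github.com/fel-thomas/sobol-attribution-method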
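To make the variance-based idea in the abstract above concrete, the following sketch estimates total-order Sobol indices with the Jansen estimator, a standard choice in sensitivity analysis. The abstract does not say which estimator or sampling scheme the authors use, so the estimator, the plain pseudo-random sampling, and the names `total_sobol_indices` and `toy_model` are assumptions for illustration only. Here `toy_model` stands in for a network score as a function of per-region mask values.

```python
import random


def total_sobol_indices(f, dim, n_samples=2000, seed=0):
    """Estimate total-order Sobol indices of f on [0, 1]^dim (Jansen estimator)."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B, each with n_samples rows.
    a = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    f_a = [f(row) for row in a]
    mean = sum(f_a) / n_samples
    variance = sum((v - mean) ** 2 for v in f_a) / (n_samples - 1)
    indices = []
    for i in range(dim):
        squared_diff = 0.0
        for row_a, row_b, y in zip(a, b, f_a):
            # Resample only coordinate i, keeping every other input fixed.
            mixed = list(row_a)
            mixed[i] = row_b[i]
            squared_diff += (y - f(mixed)) ** 2
        indices.append(squared_diff / (2.0 * n_samples * variance))
    return indices


def toy_model(x):
    """Hypothetical stand-in for a model score; the third input is ignored."""
    return x[0] + 2.0 * x[1]
```

Calling `total_sobol_indices(toy_model, dim=3)` should rank the second input above the first and assign the unused third input an index of zero, which mirrors how a region that never affects the prediction would receive no importance.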
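A plethora of methods have been proposed to explain how deep neural networks reach their decisions, but comparatively little effort has been made to ensure that the explanations produced by these methods are objectively relevant. While several desirable properties for trustworthy explanations have been formulated, objective measures have been harder to derive. Here, we propose two new measures to evaluate explanations, borrowed from the field of algorithmic stability: mean generalizability (MeGe) and relative consistency (ReCo). We conduct extensive experiments on different network architectures, common explainability methods, and several image datasets to demonstrate the benefits of the proposed measures. In comparison to ours, popular fidelity measures are not sufficient to guarantee trustworthy explanations. Finally, we find that 1-Lipschitz networks produce explanations with higher MeGe and ReCo than common neural networks while reaching similar accuracy. This suggests that 1-Lipschitz networks are a relevant direction toward more explainable and trustworthy predictors.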
View-dependent effects such as reflections pose a substantial challenge for image-based and neural rendering algorithms. Above all, curved reflectors are particularly hard, as they lead to highly non-linear reflection flows as the camera moves. We introduce a new point-based representation to compute Neural Point Catacaustics allowing novel-view synthesis of scenes with curved reflectors, from a set of casually-captured input photos. At the core of our method is a neural warp field that models catacaustic trajectories of reflections, so complex specular effects can be rendered using efficient point splatting in conjunction with a neural renderer. One of our key contributions is the explicit representation of reflections with a reflection point cloud which is displaced by the neural warp field, and a primary point cloud which is optimized to represent the rest of the scene. After a short manual annotation step, our approach allows interactive high-quality renderings of novel views with accurate reflection flow. Additionally, the explicit representation of reflection flow supports several forms of scene manipulation in captured scenes, such as reflection editing, cloning of specular objects, reflection tracking across views, and comfortable stereo viewing. We provide the source code and other supplemental material at https://repo-sam.inria.fr/fungraph/neural_catacaustics/
Edge computing is changing the face of many industries and services. Common edge computing models offload computing, which is prone to security risks and privacy violations. However, advances in deep learning have enabled Internet of Things (IoT) devices to make decisions and run cognitive tasks locally. This research introduces a decentralized-control edge model in which most computation and decisions are moved to the IoT level. The model aims to decrease communication with the edge, which in turn enhances efficiency and decreases latency. The model also avoids data transfer, which raises security and privacy risks. To examine the model, we developed SAFEMYRIDES, a scene-aware ridesharing monitoring system in which smartphones detect violations at runtime. Current real-time monitoring systems are costly and require continuous network connectivity. The system uses optimized deep learning models that run locally on IoT devices to detect violations in ridesharing and record violation incidents. The system would enhance safety and security in ridesharing without violating privacy.